separation condition
- North America > United States > California > Los Angeles County > Los Angeles (0.29)
- Europe > Austria > Vienna (0.14)
- Oceania > Australia > New South Wales > Sydney (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
A Bayesian approach to learning mixtures of nonparametric components
Zhang, Yilei, Wei, Yun, Guha, Aritra, Nguyen, XuanLong
Mixture models are widely used in modeling heterogeneous data populations. A standard approach to mixture modeling is to assume that the mixture component takes a parametric kernel form, while the flexibility of the model can be obtained by using a large or possibly unbounded number of such parametric kernels. In many applications, making parametric assumptions on the latent subpopulation distributions may be unrealistic, which motivates the need for nonparametric modeling of the mixture components themselves. In this paper we study finite mixtures with nonparametric mixture components, using a Bayesian nonparametric modeling approach. In particular, it is assumed that the data population is generated according to a finite mixture of latent component distributions, where each component is endowed with a Bayesian nonparametric prior such as the Dirichlet process mixture. We present conditions under which the individual mixture components' distributions can be identified, and establish posterior contraction behavior for the data population's density, as well as densities of the latent mixture components. We develop an efficient MCMC algorithm for posterior inference and demonstrate via simulation studies and real-world data illustrations that it is possible to efficiently learn complex distributions for the latent subpopulations. In theory, the posterior contraction rate of the component densities is nearly polynomial, which is a significant improvement over the logarithmic convergence rate of estimating mixing measures via deconvolution.
- North America > United States > Virginia > Fairfax County > Fairfax (0.04)
- North America > United States > Texas (0.04)
- North America > United States > Ohio (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.92)
Appendix for "Label consistency in overfitted generalized k-means"
Figure S1(a) shows a noisy circle-torus model (cf. To show the result, it is enough to use Theorem 2 with properly chosen (fake) centers on the above dataset. The code is provided as a ZIP file as part of the supplementary material. True clusters are distinguished by their color. We have the following extension of Proposition 3. Proposition S1.
- North America > United States > California > Los Angeles County > Los Angeles (0.29)
- Europe > Austria > Vienna (0.14)
- Oceania > Australia > New South Wales > Sydney (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > California > Alameda County > Oakland (0.04)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
Test-Time Regret Minimization in Meta Reinforcement Learning
Meta reinforcement learning sets a distribution over a set of tasks on which the agent can train at will, then is asked to learn an optimal policy for any test task efficiently. In this paper, we consider a finite set of tasks modeled through Markov decision processes with various dynamics. We assume to have endured a long training phase, from which the set of tasks is perfectly recovered, and we focus on regret minimization against the optimal policy in the unknown test task. Under a separation condition that states the existence of a state-action pair revealing a task against another, Chen et al. (2022) show that $O(M^2 \log(H))$ regret can be achieved, where $M, H$ are the number of tasks in the set and test episodes, respectively. In our first contribution, we demonstrate that the latter rate is nearly optimal by developing a novel lower bound for test-time regret minimization under separation, showing that a linear dependence with $M$ is unavoidable. Then, we present a family of stronger yet reasonable assumptions beyond separation, which we call strong identifiability, enabling algorithms achieving fast rates $\log (H)$ and sublinear dependence with $M$ simultaneously. Our paper provides a new understanding of the statistical barriers of test-time regret minimization and when fast rates can be achieved.
- Europe > Austria > Vienna (0.14)
- Asia > Middle East > Jordan (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Leisure & Entertainment (0.67)
- Media > Television (0.45)
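The separation condition described in the test-time regret minimization abstract above (for every pair of tasks, some state-action pair reveals one task against the other) can be illustrated with a small check over a finite task set. This is a minimal sketch under stated assumptions, not the construction used by the cited authors: it assumes tasks differ only in their transition dynamics, that "revealing" means the next-state distributions differ by at least a threshold `lam` in total variation distance, and that the tabular layout `tasks[m][s][a]` is a convenient representation. The metric, the threshold, and all function names here are hypothetical.

```python
from itertools import combinations


def tv_distance(p, q):
    """Total variation distance between two distributions over the same next states."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def separating_pairs(tasks, lam):
    """Map each pair of task indices (i, j) to the (state, action) pairs separating them.

    tasks[m][s][a] is the next-state distribution of task m at state s, action a.
    A pair (s, a) separates tasks i and j when the TV distance between their
    next-state distributions at (s, a) is at least lam (an assumed criterion).
    """
    result = {}
    for i, j in combinations(range(len(tasks)), 2):
        result[(i, j)] = [
            (s, a)
            for s, row in enumerate(tasks[i])
            for a, p in enumerate(row)
            if tv_distance(p, tasks[j][s][a]) >= lam
        ]
    return result


def is_separated(tasks, lam):
    """True when every pair of tasks has at least one separating (state, action) pair."""
    return all(separating_pairs(tasks, lam).values())


# Three toy tasks with 2 states and 2 actions; action 1 is uninformative everywhere.
uniform = [0.5, 0.5]
task0 = [[[1.0, 0.0], uniform], [[0.0, 1.0], uniform]]
task1 = [[[0.0, 1.0], uniform], [[0.0, 1.0], uniform]]
task2 = [[[1.0, 0.0], uniform], [[1.0, 0.0], uniform]]
tasks = [task0, task1, task2]

pairs = separating_pairs(tasks, lam=0.5)
separated = is_separated(tasks, lam=0.5)
```

In this toy set, tasks 0 and 1 can only be told apart at state 0 with action 0, so an agent that never visits that pair cannot distinguish them. That is the intuition behind requiring such a pair to exist for every pair of tasks.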
Elementary fractal geometry. 5. Weak separation is strong separation
Bandt, Christoph, Barnsley, Michael F.
For self-similar sets, there are two important separation properties: the open set condition and the weak separation condition introduced by Zerner, which may be replaced by the formally stronger finite type property of Ngai and Wang. We show that any finite type self-similar set can be represented as a graph-directed construction obeying the open set condition. The proof is based on a combinatorial algorithm which performed well in computer experiments.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Oceania > Australia > Australian Capital Territory > Canberra (0.04)
- North America > United States > New York (0.04)
A Spectral Algorithm for Latent Dirichlet Allocation
Topic modeling is a generalization of clustering that posits that observations (words in a document) are generated by multiple latent factors (topics), as opposed to just one. This increased representational power comes at the cost of a more challenging unsupervised learning problem of estimating the topic-word distributions when only words are observed, and the topics are hidden. This work provides a simple and efficient learning procedure that is guaranteed to recover the parameters for a wide class of topic models, including Latent Dirichlet Allocation (LDA). For LDA, the procedure correctly recovers both the topic-word distributions and the parameters of the Dirichlet prior over the topic mixtures, using only trigram statistics (i.e., third order moments, which may be estimated with documents containing just three words). The method, called Excess Correlation Analysis, is based on a spectral decomposition of low-order moments via two singular value decompositions (SVDs). Moreover, the algorithm is scalable, since the SVDs are carried out only on $k \times k$ matrices, where $k$ is the number of latent factors (topics) and is typically much smaller than the dimension of the observation (word) space.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- South America > Paraguay > Asunción > Asunción (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- Education (0.48)
- Government (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.96)
- Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.93)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.71)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Beurling-Selberg Extremization for Dual-Blind Deconvolution Recovery in Joint Radar-Communications
Monsalve, Jonathan, Vargas, Edwin, Mishra, Kumar Vijay, Sadler, Brian M., Arguello, Henry
Recent interest in integrated sensing and communications has led to the design of novel signal processing techniques to recover information from an overlaid radar-communications signal. Here, we focus on a spectral coexistence scenario, wherein the channels and transmit signals of both radar and communications systems are unknown to the common receiver. In this dual-blind deconvolution (DBD) problem, the receiver admits a multi-carrier wireless communications signal that is overlaid with the radar signal reflected off multiple targets. The communications and radar channels are represented by continuous-valued range-times or delays corresponding to multiple transmission paths and targets, respectively. Prior works addressed recovery of unknown channels and signals in this ill-posed DBD problem through atomic norm minimization but contingent on individual minimum separation conditions for radar and communications channels. In this paper, we provide an optimal joint separation condition using extremal functions from the Beurling-Selberg interpolation theory. Thereafter, we formulate DBD as a low-rank modified Hankel matrix retrieval and solve it via nuclear norm minimization. We estimate the unknown target and communications parameters from the recovered low-rank matrix using multiple signal classification (MUSIC) method. We show that the joint separation condition also guarantees that the underlying Vandermonde matrix for MUSIC is well-conditioned. Numerical experiments validate our theoretical findings.
- South America > Colombia > Santander Department > Bucaramanga (0.04)
- South America > Colombia > Bogotá D.C. > Bogotá (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > Maryland > Prince George's County > Adelphi (0.04)
- Information Technology > Artificial Intelligence (0.46)
- Information Technology > Communications > Networks (0.34)
On the robust learning mixtures of linear regressions
In this note, we consider the problem of robust learning mixtures of linear regressions. We connect mixtures of linear regressions and mixtures of Gaussians with a simple thresholding, so that a quasi-polynomial time algorithm can be obtained under some mild separation condition. This algorithm has significantly better robustness than the previous result.
- Asia > China (0.05)
- Asia > Middle East > Jordan (0.04)
A useful criterion on studying consistent estimation in community detection
In network analysis, developing a unified theoretical framework that can compare methods under different models is an interesting problem. This paper proposes a partial solution to this problem. We summarize the idea of using the separation condition for a standard network and the sharp threshold of the Erdős-Rényi random graph to study consistent estimation, and to compare theoretical error rates and requirements on network sparsity of spectral methods under models that can degenerate to the stochastic block model, as a four-step criterion, SCSTC. Using SCSTC, we find some inconsistent phenomena on the separation condition and sharp threshold in community detection. In particular, we find that the original theoretical results of the SPACL algorithm, introduced to estimate network memberships under the mixed membership stochastic blockmodel, were sub-optimal. To find the formation mechanism of the inconsistencies, we re-establish theoretical convergence rates of this algorithm by applying recent techniques on row-wise eigenvector deviation. The results are further extended to the degree corrected mixed membership model. By comparison, our results enjoy smaller error rates, less dependence on the number of communities, weaker requirements on network sparsity, and so forth. Furthermore, the separation condition and sharp threshold obtained from our theoretical results match classical results, which shows the usefulness of this criterion for studying consistent estimation.
- North America > United States (0.14)
- Asia > China (0.04)